The celebrated de la Garza phenomenon states that for a polynomial
regression model of degree $p - 1$ any optimal design can be based
on at most $p$ design points. In a remarkable paper, Yang [Ann. Statist.
38 (2010) 2499-2524] showed that this phenomenon exists in many
locally optimal design problems for nonlinear models. In the present
note, we present a different viewpoint on these findings using results
about moment theory and Chebyshev systems. In particular, we show
that this phenomenon occurs in an even larger class of models than
considered so far.

1. Introduction. Nonlinear regression models are widely used for modeling
dependencies between response and explanatory variables [see Seber
and Wild (1989) or Ratkowsky (1990)]. It is well known that an appropriate
choice of an experimental design can improve the quality of statistical
analysis substantially, and therefore the problem of constructing optimal
designs for nonlinear regression models has found considerable attention in
the literature. Most authors concentrate on locally optimal designs which
assume that a guess for the unknown parameters of the model is available
[see Chernoff (1953), Ford, Torsney and Wu (1992), He, Studden and Sun
(1996), Fang and Hedayat (2008)]. These designs are usually used as benchmarks
for commonly used designs. Additionally, they serve as a basis for
constructing optimal designs with respect to more sophisticated optimality
criteria which account for less precise knowledge about the unknown parameters
[see Pronzato and Walter (1985) or Chaloner and Verdinelli (1995),
Dette (1997), Müller and Pázman (1998)]. It is a well-known fact that the
numerical or analytical calculation of optimal designs simplifies substantially
if it is known that the optimal design is saturated, which means that the
number of different experimental conditions coincides with the number of
parameters in the model [see, e.g., He, Studden and Sun (1996), Dette and
Wong (1996), Imhof and Studden (2001), Imhof (2001), Melas (2006), Fang
and Hedayat (2008) among many others].

So, the ideal situation arises when the optimal design lies in the subclass of
all saturated designs. In a celebrated paper, de la Garza (1954) proved that
for a $(p - 1)$th-degree polynomial regression model, any optimal design can
be based on at most $p$ points. Khuri et al. (2006) considered a nonlinear
regression model and introduced the terminology of the de la Garza phe- 
nomenon, which means that for any design there exists a saturated design, 
such that the information matrix of the saturated design is not inferior to 
that of the given design under the Loewner ordering. In a remarkable paper, 
Yang (2010) derived sufficient conditions on the nonlinear regression model 
for the occurrence of the de la Garza phenomenon and demonstrated that 
this situation appears in a broad class of nonlinear regression models. These 
results generalize recent findings of Yang and Stufken (2009) for nonlinear 
models with two parameters. 

However, some care is necessary if these results are applied, as indicated
by the following simple example of homoscedastic linear regression on the
interval $[0, 1]$. Here the information matrix of the design which advises the
experimenter to take all $n$ observations at the point $0$ is given by

$$X_1^T X_1 = \begin{pmatrix} n & 0 \\ 0 & 0 \end{pmatrix},$$

while any other design (using the experimental conditions $x_1, \dots, x_n$) yields
an information matrix

$$X_2^T X_2 = \begin{pmatrix} n & \sum_{i=1}^n x_i \\ \sum_{i=1}^n x_i & \sum_{j=1}^n x_j^2 \end{pmatrix}.$$

It is easy to see that the matrix $X_2^T X_2 - X_1^T X_1$ is indefinite (i.e., it has
positive and negative eigenvalues) whenever one of the $x_i$ is positive.
Consequently, the design corresponding to $X_1^T X_1$ cannot be improved. On the
other hand, it is also easy to see that for any $k \in \{1, \dots, \lfloor n/2 \rfloor - 1\}$ the
information matrix of the design which takes observations at
$x_1 = \dots = x_{n-2k} = 0$ and at $x_{n-2k+1} = \dots = x_n = 1/2$ can be improved (with respect
to the Loewner ordering) by the information matrix corresponding to the
design $x_1 = \dots = x_{n-k} = 0$ and $x_{n-k+1} = \dots = x_n = 1$. Thus, there exist
designs where a "real" improvement is possible, while other designs cannot be
improved. Note that the results in Yang (2010) do not provide a classification
of the two types of designs.
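
The two claims of this example can be checked with exact arithmetic. The following is a minimal sketch, not part of the original argument: the helper names (`info_matrix`, `difference`, `is_psd`, `is_indefinite`) and the choice $n = 10$, $k = 2$ are assumptions made for illustration. A symmetric $2 \times 2$ matrix is stored as the triple of its entries $(m_{11}, m_{12}, m_{22})$.

```python
from fractions import Fraction


def info_matrix(points):
    """X^T X for regressors (1, x), stored as (m11, m12, m22)."""
    n = len(points)
    s1 = sum(points, Fraction(0))
    s2 = sum((x * x for x in points), Fraction(0))
    return (Fraction(n), s1, s2)


def difference(m_new, m_old):
    """Entrywise difference of two symmetric 2x2 matrices."""
    return tuple(a - b for a, b in zip(m_new, m_old))


def is_psd(m):
    """Nonnegative definite iff diagonal entries and determinant are >= 0."""
    m11, m12, m22 = m
    return m11 >= 0 and m22 >= 0 and m11 * m22 - m12 * m12 >= 0


def is_indefinite(m):
    """A negative determinant means eigenvalues of opposite sign."""
    m11, m12, m22 = m
    return m11 * m22 - m12 * m12 < 0


# Hypothetical sample size and k chosen for illustration only.
n, k = 10, 2
all_zero = info_matrix([Fraction(0)] * n)
half_design = info_matrix([Fraction(0)] * (n - 2 * k) + [Fraction(1, 2)] * (2 * k))
one_design = info_matrix([Fraction(0)] * (n - k) + [Fraction(1)] * k)

# The design with k observations at 1 improves the one with 2k observations at 1/2.
improves = is_psd(difference(one_design, half_design))
# Compared with "all observations at 0", the difference is indefinite.
zero_vs_one = is_indefinite(difference(one_design, all_zero))
```

Under these assumptions both flags evaluate to true, in line with the discussion above.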

It is the purpose of the present paper to present a more detailed viewpoint
on these problems, which clarifies this apparent contradiction. In
contrast to the method used by Yang (2010), which is mainly algebraic,
our approach is analytic and based on the theory of Chebyshev systems
and moment spaces [see Karlin and Studden (1966b)]. In particular, we will
demonstrate that the de la Garza phenomenon appears in any nonlinear
regression model where the functions in the Fisher information matrix form
a Chebyshev system. Additionally, we will solve the problem described in the
previous paragraph and we will identify the sufficient conditions stated in
Yang (2010) as a special case of an extended Chebyshev system. Therefore,
our results generalize the recent findings of Yang (2010) in a nontrivial way
and, additionally, provide, in our opinion, a more transparent and more
complete explanation of the de la Garza phenomenon for optimal designs in
nonlinear regression models.

The remaining part of this paper is organized as follows. Section 2 provides
a brief introduction to the problem, while Section 3 contains our main results.
Finally, in Section 4 the new results are illustrated in a rational regression
model, where the currently available methodology cannot be used to establish
the de la Garza phenomenon.

2. Locally optimal designs. Consider the common nonlinear regression
model

(2.1) $Y = \eta(x, \theta) + \varepsilon,$

where $\theta \in \Theta \subset \mathbb{R}^p$ is the vector of unknown parameters, and different
observations are assumed to be independent. The errors are normally distributed
with mean $0$ and variance $\sigma^2$. The variable $x$ denotes the explanatory variable,
which varies in the design space $[A, B] \subset \mathbb{R}$. We assume that $\eta$ is a
continuous and real valued function of both arguments $(x, \theta) \in [A, B] \times \Theta$
and differentiable with respect to the variable $\theta$. A design is defined as
a probability measure $\xi$ on the interval $[A, B]$ with finite support [see Kiefer
(1974)]. If the design $\xi$ has masses $w_i$ at the points $x_i$ ($i = 1, \dots, k$) and $n$
observations can be made by the experimenter, this means that the quantities
$w_i n$ are rounded to integers, say $n_i$, satisfying $\sum_{i=1}^k n_i = n$, and the
experimenter takes $n_i$ observations at each location $x_i$ ($i = 1, \dots, k$). The
information matrix of an approximate design $\xi$ is defined by

(2.2) $M(\xi, \theta) = \int_A^B \left(\frac{\partial}{\partial\theta}\eta(x, \theta)\right)\left(\frac{\partial}{\partial\theta}\eta(x, \theta)\right)^T d\xi(x),$

and it is well known [see Jennrich (1969)] that under appropriate assumptions
of regularity the covariance matrix of the least squares estimator is
approximately given by $\sigma^2 M^{-1}(\xi, \theta)/n$, where $n$ denotes the total sample
size and we assume that the observations are taken according to the
approximate design $\xi$.
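
For a design with finite support, the integral in (2.2) reduces to a weighted sum of outer products of the gradient. The following sketch illustrates this; the function names and the exponential model $\eta(x, \theta) = \theta_1 \exp(-\theta_2 x)$ are hypothetical choices made for illustration and do not appear in the paper.

```python
import math


def information_matrix(design, gradient):
    """M = sum_i w_i g(x_i) g(x_i)^T for a design [(x_i, w_i), ...]."""
    p = len(gradient(design[0][0]))
    m = [[0.0] * p for _ in range(p)]
    for x, w in design:
        g = gradient(x)
        for r in range(p):
            for c in range(p):
                m[r][c] += w * g[r] * g[c]
    return m


def exponential_gradient(theta1, theta2):
    """Gradient w.r.t. theta of the assumed model theta1 * exp(-theta2 * x)."""
    def grad(x):
        e = math.exp(-theta2 * x)
        return [e, -theta1 * x * e]
    return grad


# Hypothetical two-point design on [0, 1] with equal weights,
# evaluated at the parameter guess theta = (1, 1).
design = [(0.0, 0.5), (1.0, 0.5)]
M = information_matrix(design, exponential_gradient(1.0, 1.0))
```

The dependence of `M` on the parameter guess passed to `exponential_gradient` is what makes the resulting optimal designs local.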

An optimal design maximizes an appropriate functional of the information
matrix, and numerous criteria have been proposed in the literature to
discriminate between competing designs [see Silvey (1980), Pázman (1986)
or Pukelsheim (2006) among others]. Note that in nonlinear regression models
the information matrix, and as a consequence the corresponding optimal
designs, depend on the unknown parameters; these designs are therefore called
locally optimal designs [see Chernoff (1953)]. They require an initial guess
of the unknown parameters in the model and are used as benchmarks for
many commonly used designs.

Most of the available optimality criteria satisfy a monotonicity property
with respect to the Loewner ordering, that is

(2.3) $M(\xi_1, \theta) \le M(\xi_2, \theta) \;\Longrightarrow\; \Phi(M(\xi_1, \theta)) \le \Phi(M(\xi_2, \theta)),$

where the parameter $\theta$ is fixed, $\xi_1, \xi_2$ are two competing designs and $\Phi$
denotes an information function in the sense of Pukelsheim (2006). For this
reason, it is of interest to derive a complete class theorem in this general
context which characterizes the class of designs which cannot be improved
with respect to the Loewner ordering of their information matrices. We
call a design $\xi_1$ admissible if there does not exist a design $\xi_2$ such that
$M(\xi_1, \theta) \ne M(\xi_2, \theta)$ and

(2.4) $M(\xi_1, \theta) \le M(\xi_2, \theta).$

As pointed out in Yang (2010), for many nonlinear regression models the
information matrix defined in (2.2) has a representation of the form

(2.5) $M(\xi, \theta) = P(\theta)\, C(\xi, \theta)\, P^T(\theta),$

where $P(\theta)$ is a nonsingular $p \times p$ matrix which does not depend on the
design $\xi$, the matrix $C$ is defined by

(2.6) $C(\xi, \theta) = \begin{pmatrix} \int_A^B \Psi_{11}(x)\,d\xi(x) & \cdots & \int_A^B \Psi_{1p}(x)\,d\xi(x) \\ \vdots & \ddots & \vdots \\ \int_A^B \Psi_{p1}(x)\,d\xi(x) & \cdots & \int_A^B \Psi_{pp}(x)\,d\xi(x) \end{pmatrix}$

and $\Psi_{11}, \Psi_{12}, \dots, \Psi_{pp}$ are functions defined on the interval $[A, B]$. Note that
these functions usually depend on the parameter $\theta$, but for the sake of simplicity
we do not reflect this dependence in our notation. Obviously the
inequality (2.4) is satisfied if and only if the inequality

(2.7) $C(\xi_1, \theta) \le C(\xi_2, \theta)$

is satisfied.
3. Chebyshev systems and complete class theorems. In the following
discussion, we make extensive use of the property that a system of functions
has the Chebyshev property. Following Karlin and Studden (1966b), a set
of $k + 1$ continuous functions $u_0, \dots, u_k : [A, B] \to \mathbb{R}$ is called a Chebyshev
system (on the interval $[A, B]$) if the inequality

(3.1) $\begin{vmatrix} u_0(x_0) & u_0(x_1) & \cdots & u_0(x_k) \\ u_1(x_0) & u_1(x_1) & \cdots & u_1(x_k) \\ \vdots & \vdots & \ddots & \vdots \\ u_k(x_0) & u_k(x_1) & \cdots & u_k(x_k) \end{vmatrix} > 0$

holds for all $A \le x_0 < x_1 < \dots < x_k \le B$. Note that if the determinant in (3.1)
does not vanish then either the functions $u_0, u_1, \dots, u_{k-1}, u_k$ or the functions
$u_0, u_1, \dots, u_{k-1}, -u_k$ form a Chebyshev system. The Chebyshev property has
widely been used to determine explicitly $c$-optimal designs [see He, Studden
and Sun (1996), Dette et al. (2003) or Dette et al. (2008) among many
others]. On the other hand, its application to other optimality criteria has
not been studied intensively. In the following discussion, we will demonstrate
that this property will essentially be the reason for the occurrence of the de la
Garza phenomenon. In particular, we will show that it is essentially sufficient
to obtain a complete class theorem for the design problems associated with
the nonlinear regression model (2.1).
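
The determinant condition (3.1) can be explored numerically. The sketch below is an illustration under stated assumptions, not a proof: it checks the sign of the determinant only for increasing tuples drawn from a finite grid, which is necessary but not sufficient for the Chebyshev property. The helper names and the test system (the monomials $1, x, x^2, x^3$ on $[0, 1]$) are chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations


def determinant(rows):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def chebyshev_on_grid(functions, grid):
    """Check det[u_i(x_j)] > 0 for every increasing tuple from the grid."""
    size = len(functions)
    # combinations() keeps the increasing order of the sorted grid.
    for xs in combinations(grid, size):
        matrix = [[u(x) for x in xs] for u in functions]
        if determinant(matrix) <= 0:
            return False
    return True


grid = [Fraction(i, 6) for i in range(7)]               # points in [0, 1]
monomials = [lambda x, j=j: x ** j for j in range(4)]   # 1, x, x^2, x^3
flipped = monomials[:-1] + [lambda x: -x ** 3]          # last function negated
```

For the monomials the determinant is of Vandermonde type and therefore positive, while negating the last function reverses the sign, in line with the remark following (3.1).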

For this purpose, we define the index $I(\xi)$ of a design $\xi$ on the interval
$[A, B]$ as the number of support points, where the boundary points $A$ and
$B$ (if they occur as support points) are only counted by $1/2$. Recall the
definition of the matrix $C$ in (2.6) and denote by $\Psi_1, \dots, \Psi_k$ the different
elements among the functions $\{\Psi_{ij} \mid 1 \le i, j \le p\}$ which are not equal to the
constant function. Throughout this paper, we assume

(3.2) $\Psi_k = \Psi_{ll}$ for some $l \in \{1, \dots, p\}$ and $\Psi_{ij} \ne \Psi_k$ for all $(i, j) \ne (l, l)$

[see Yang (2010)]. Additionally, we put $\Psi_0(x) = 1$ and assume either that

(3.3) $\{\Psi_0, \Psi_1, \dots, \Psi_{k-1}\}$ and $\{\Psi_0, \Psi_1, \dots, \Psi_{k-1}, \Psi_k\}$

are Chebyshev systems or that

(3.4) $\{\Psi_0, \Psi_1, \dots, \Psi_{k-1}\}$ and $\{\Psi_0, \Psi_1, \dots, \Psi_{k-1}, -\Psi_k\}$

are Chebyshev systems. The following result then characterizes the class of
admissible designs.
Theorem 3.1. (1) If the functions $\Psi_0(x) = 1, \Psi_1, \dots, \Psi_{k-1}, \Psi_k$ satisfy
(3.2) and (3.3), then for any design $\xi$ there exists a design $\xi^+$ with at most
$(k+2)/2$ support points, such that $M(\xi^+, \theta) \ge M(\xi, \theta)$. If the index of the design $\xi$
satisfies $I(\xi) < k/2$, then the design $\xi^+$ is uniquely determined in the class of
all designs $\eta$ satisfying

(3.5) $\int_A^B \Psi_i(x)\,d\eta(x) = \int_A^B \Psi_i(x)\,d\xi(x), \qquad i = 0, \dots, k-1,$

and coincides with the design $\xi$. Otherwise [in the case $I(\xi) \ge k/2$], the following
two assertions are valid.

(1a) If $k$ is odd, then $\xi^+$ has at most $(k+1)/2$ support points and $\xi^+$ can be
chosen such that its support contains the point $B$.
(1b) If $k$ is even, then $\xi^+$ has at most $k/2 + 1$ support points and $\xi^+$ can be
chosen such that the support of $\xi^+$ contains the points $A$ and $B$.

(2) If the functions $\Psi_0(x) = 1, \Psi_1, \dots, \Psi_{k-1}, \Psi_k$ satisfy (3.2) and (3.4),
then for any design $\xi$ there exists a design $\xi^-$ with at most $(k+1)/2$ support
points, such that $M(\xi^-, \theta) \ge M(\xi, \theta)$. If the index of the design $\xi$ satisfies
$I(\xi) < k/2$, then the design $\xi^-$ is uniquely determined in the class of all designs $\eta$
satisfying (3.5) and coincides with the design $\xi$. Otherwise [in the case $I(\xi) \ge k/2$],
the following two assertions are valid.

(2a) If $k$ is odd, then $\xi^-$ has at most $(k+1)/2$ support points and $\xi^-$ can be
chosen such that its support contains the point $A$.
(2b) If $k$ is even, then $\xi^-$ has at most $k/2$ support points.

Proof. We only present a proof of the first part (1) of the theorem; the
second part follows by similar arguments. For $i = 0, \dots, k$ let

$$d_i(\xi) = \int_A^B \Psi_i(x)\,d\xi(x)$$

denote the $i$th "moment" and define

$$\mathbf{d}_k(\xi) = (d_0(\xi), \dots, d_k(\xi))^T$$

as the vector of all "moments" up to the order $k$. Consider two designs $\xi_1$
and $\xi_2$ with

$$\mathbf{d}_{k-1}(\xi_1) = \mathbf{d}_{k-1}(\xi_2) \quad\text{and}\quad d_k(\xi_1) \le d_k(\xi_2),$$

then for any vector $z = (z_1, \dots, z_p)^T \in \mathbb{R}^p$ we have for some $l \in \{1, \dots, p\}$

$$z^T (C(\xi_2, \theta) - C(\xi_1, \theta)) z \ge z_l^2 (d_k(\xi_2) - d_k(\xi_1)) \ge 0,$$

which means that

$$C(\xi_2, \theta) \ge C(\xi_1, \theta).$$

Now let for a fixed vector of "moments" $\mathbf{d}_{k-1}(\xi)$

$$d_k^+ = \sup\{ d_k(\eta) \mid \eta \text{ design on } [A, B] \text{ with } \mathbf{d}_{k-1}(\eta) = \mathbf{d}_{k-1}(\xi) \}$$

denote the maximum of the $k$th "moment" over the set of all designs with
fixed "moments" up to the order $k - 1$. Due to the compactness of the design
space and the continuity of the functions $\Psi_0, \dots, \Psi_k$, there exists a design $\xi^+$
such that

(3.6) $d_j(\xi^+) = d_j(\xi), \qquad j = 0, \dots, k-1,$

(3.7) $d_k(\xi^+) = d_k^+ \ge d_k(\xi).$

This shows (by the argument at the beginning of the proof and the discussion
at the end of the previous section)

(3.8) $M(\xi^+, \theta) \ge M(\xi, \theta).$

Moreover, it follows from Chapter II, Section 6 of Karlin and Studden
(1966b) that the point $\mathbf{d}_k(\xi^+)$ is a boundary point of the "moment space"

$$\mathcal{M}_k = \{ \mathbf{d}_k(\eta) \mid \eta \text{ design on } [A, B] \}.$$

Consequently, we obtain from Theorem 2.1 in Karlin and Studden (1966b)
that the design $\xi^+$ is based on at most $(k+2)/2$ support points, which proves the
first part of the statement.

We now consider the cases (1a) and (1b). The vector $\mathbf{d}_{k-1}(\xi)$ is either
a boundary point or an interior point of the $(k-1)$th moment space $\mathcal{M}_{k-1}$.
The first case is characterized by an index satisfying $I(\xi) < k/2$ and there
exists a unique measure $\xi$ with "moments" up to the order $k - 1$ specified by
$\mathbf{d}_{k-1}(\xi)$. To prove this statement regarding uniqueness suppose that $I(\xi) < k/2$
and that there exists a further design, say $\tilde\xi$, with this property. A simple
counting argument shows that the total number of distinct points, say
$x_1, \dots, x_t$, among the support points of both representations is at most $k$. If
it were less than $k$ we could take additional support points with
corresponding vanishing weights, and thus without loss of generality we can
assume that the number of distinct points is equal to $k$. Therefore, there
would exist $k$ different points

$$A \le x_0 < x_1 < \dots < x_{k-1} \le B$$

such that

$$\Psi \mu = 0,$$

where the matrix $\Psi$ is given by

$$\Psi = \begin{pmatrix} \Psi_0(x_0) & \Psi_0(x_1) & \cdots & \Psi_0(x_{k-1}) \\ \Psi_1(x_0) & \Psi_1(x_1) & \cdots & \Psi_1(x_{k-1}) \\ \vdots & \vdots & \ddots & \vdots \\ \Psi_{k-1}(x_0) & \Psi_{k-1}(x_1) & \cdots & \Psi_{k-1}(x_{k-1}) \end{pmatrix}$$

and the vector $\mu \ne 0$ has components

$$\mu_i = \begin{cases} w_i, & x_i \in \operatorname{supp}\xi,\ x_i \notin \operatorname{supp}\tilde\xi, \\ -\tilde w_i, & x_i \notin \operatorname{supp}\xi,\ x_i \in \operatorname{supp}\tilde\xi, \\ w_i - \tilde w_i, & x_i \in \operatorname{supp}\xi \cap \operatorname{supp}\tilde\xi, \\ 0, & x_i \notin \operatorname{supp}\xi,\ x_i \notin \operatorname{supp}\tilde\xi \end{cases}$$

(here $w_i$ and $\tilde w_i$ denote the weights of the designs $\xi$ and $\tilde\xi$, resp.). Because $\mu \ne 0$
it follows from here that $\det \Psi = 0$, which is impossible by the definition of
Chebyshev systems. Consequently, a design with moments specified by (3.5)
is uniquely determined and therefore we take $\xi^+ = \xi$, which has at most
$(k+2)/2$ support points [see Theorem 2.1 in Karlin and Studden (1966b), page 42].

If the index of the design $\xi$ satisfies $I(\xi) \ge k/2$ it follows from the
discussion in Chapter II, Section 6 in Karlin and Studden (1966b) that the
design $\xi^+$ defined by (3.6) and (3.7) is the upper principal representation of
the vector $\mathbf{d}_{k-1}(\xi)$, which means that its index is precisely $k/2$ and its support
includes the point $B$. Note that for this argument we require condition (3.3).

Consequently, if $k = 2m + 1$ is odd, the upper principal representation $\xi^+$
has index $m + 1/2$ and precisely $m + 1$ support points including the point $B$.
On the other hand, if $k = 2m$ is even, $\xi^+$ has $m + 1$ support points and the
boundary points $A$ and $B$ of the design interval are support points because
the index of the design $\xi^+$ is $m$.

The proof of part (2) of Theorem 3.1 is similar [where the upper principal
representation has to be replaced by the lower principal representation using
condition (3.4)] and omitted. □



Remark 3.2. (a) Note that Theorem 2.1 in Karlin and Studden [(1966b),
Chapter II] refers to moment spaces corresponding to not necessarily bounded
measures, and the inclusion of the constant function in the system under
consideration guarantees its application to a moment space corresponding
to probability measures as required in the proof of Theorem 3.1. An alternative
explanation can be given by the generalized equivalence theorem as
stated in Pukelsheim (2006). It follows from this result that for an optimal
design (with respect to the commonly used criteria) there exist some constants,
say $a_i \in \mathbb{R}$, $i = 1, \dots, k$, such that for all support points of the optimal
design the identity

$$\sum_{i=1}^k a_i \Psi_i(x) = c$$

is satisfied, where $c$ denotes a constant (e.g., for the $D$-optimality criterion $c$
is the number of parameters). Since an optimal design is admissible, the
inclusion of the constant function guarantees that the index of these designs
is at most $k/2$. Note that this is a sufficient but, generally speaking, not
necessary condition.

(b) Note that it follows from the proof of Theorem 3.1 that the conditions
(3.6) and (3.7) imply (3.8), that is, the superiority of the information matrix
of the design $\xi^+$ with respect to the Loewner ordering. In many cases
(e.g., polynomial regression models), the converse direction is also true and
in these cases it follows from the proof of Theorem 3.1 that a design $\xi$ with
index $I(\xi) < k/2$ can only be "improved" (with respect to the Loewner ordering
of the corresponding information matrices) by itself. In fact we are not
aware of any case where the converse direction does not hold.

(c) Note also that Theorem 3.1 provides a solution to the problem indicated
in the example of the Introduction. In the linear regression model
we have $k = 2$, therefore we can use the given design $\xi_1$ (concentrating all
observations at $x = 0$) as an "improvement" of $\xi_1$. However, because the
index of $\xi_1$ is $1/2 < 1$ the design $\xi_1$ can only be improved by itself (see the
previous remark). In particular, there does not exist a design $\xi$ which takes
observations at $x = 1$ and improves $\xi_1$ in the sense $M(\xi) \ge M(\xi_1)$.

(d) It is also worthwhile to mention that a design improving the given
design $\xi$ is not necessarily unique. Consider, for example, again the linear
regression model on the interval $[0, 1]$ and the design $\xi$ which has equal
masses at the points $0$ and $3/4$. The information matrix of $\xi$ is given by

$$M(\xi) = \begin{pmatrix} 1 & \frac{3}{8} \\ \frac{3}{8} & \frac{9}{32} \end{pmatrix}.$$

Now define for any $p \in [\frac{1}{2}, \frac{5}{8}]$ a design $\xi_p^+$ with masses $p$ and $1 - p$ at the
points $0$ and $\frac{3}{8(1-p)}$, respectively. Then it follows that

$$M(\xi_p^+) = \begin{pmatrix} 1 & \frac{3}{8} \\ \frac{3}{8} & \frac{9}{64(1-p)} \end{pmatrix}$$

and $M(\xi_p^+) \ge M(\xi)$ for any $p \in [\frac{1}{2}, \frac{5}{8}]$. Note that the choice $p = \frac{5}{8}$ gives the
upper principal representation $\xi^+ = \xi_{5/8}^+$ with index $1$ and support points $0$
and $1$, while for $p \in [\frac{1}{2}, \frac{5}{8})$ we have index $I(\xi_p^+) = 3/2$.
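
The computations in part (d) can be verified with exact fractions. The sketch below assumes our reading of the example: the design $\xi$ has equal masses at $0$ and $3/4$, and the competing two-point designs put mass $p$ at $0$ and $1 - p$ at $3/(8(1-p))$ for $p \in [1/2, 5/8]$. The function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction

A, B = Fraction(0), Fraction(1)   # design interval [0, 1]


def index(design):
    """Index I(xi): interior points count 1, the endpoints A and B count 1/2."""
    return sum(
        (Fraction(1, 2) if x in (A, B) else Fraction(1))
        for x, w in design
        if w > 0
    )


def moments(design):
    """(m0, m1, m2): integrals of 1, x, x^2, i.e. the matrix [[m0, m1], [m1, m2]]."""
    return tuple(sum(w * x ** j for x, w in design) for j in range(3))


def design_p(p):
    """Two-point design with mass p at 0 and mass 1 - p at 3 / (8 (1 - p))."""
    return [(Fraction(0), p), (Fraction(3) / (8 * (1 - p)), 1 - p)]


xi = [(Fraction(0), Fraction(1, 2)), (Fraction(3, 4), Fraction(1, 2))]
upper = design_p(Fraction(5, 8))     # support points 0 and 1
middle = design_p(Fraction(9, 16))   # an intermediate choice of p
```

All designs `design_p(p)` share the first two moments of `xi`, so the Loewner comparison reduces to comparing the second moments $9/(64(1-p))$ and $9/32$.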
In the remaining part of this section, we will relate the result of Theorem 3.1
to the recent findings of Yang (2010). Note that, in contrast to
Theorems 1 and 2 of Yang (2010), our Theorem 3.1 does not require the
differentiability of the functions $\Psi_j$. Moreover, in some cases it provides
a better description of the admissible designs. For a more detailed explanation,
we note that a Chebyshev system of functions $\{u_0, \dots, u_k\}$ is called
an extended Chebyshev system if and only if for any $a_0, \dots, a_k \in \mathbb{R}$ with
$\sum_{i=0}^k a_i^2 \ne 0$ the function

$$\sum_{i=0}^k a_i u_i(x)$$

has at most $k$ zeros counted with multiplicities in the interval $[A, B]$. Note
that this definition is equivalent to the definition given in Karlin and Studden
(1966b). It is in fact proved in Karlin and Studden [(1966b), Section 1.2] for
the case of the system $u_i(t) = t^i$, $i = 0, \dots, n$, and the argument can be applied
in the general case. Moreover, by definition, an extended Chebyshev system is
always a Chebyshev system.

A simple way of constructing an extended Chebyshev system is the following
[see Karlin and Studden (1966b), page 19]. Let $w_0, \dots, w_k$ be functions
on the interval $[A, B]$ which are either positive or negative. We now consider
the new functions

(3.9)
$$\begin{aligned} u_0(x) &= w_0(x), \\ u_1(x) &= w_0(x) \int_A^x w_1(t_1)\,dt_1, \\ &\;\;\vdots \\ u_k(x) &= w_0(x) \int_A^x w_1(t_1) \int_A^{t_1} w_2(t_2) \cdots \int_A^{t_{k-1}} w_k(t_k)\,dt_k \cdots dt_1. \end{aligned}$$

A direct calculation shows that the Wronskian determinant of the functions
$u_0, \dots, u_k$ is given by

(3.10)
$$W(u_0, \dots, u_k)(x) = \begin{vmatrix} u_0(x) & u_0'(x) & \cdots & u_0^{(k)}(x) \\ u_1(x) & u_1'(x) & \cdots & u_1^{(k)}(x) \\ \vdots & \vdots & \ddots & \vdots \\ u_k(x) & u_k'(x) & \cdots & u_k^{(k)}(x) \end{vmatrix} = (w_0(x))^{k+1} (w_1(x))^k \cdots (w_{k-1}(x))^2 w_k(x)$$

and it is shown in Chapter XI in Karlin and Studden (1966b) that the
set $\{u_0, \dots, u_k\}$ of $k$ times differentiable functions is an extended Chebyshev
system if and only if

$$W(u_0, \dots, u_k)(x) > 0$$

for all $x \in [A, B]$. On the other hand, this representation provides a constructive
method for checking if a given system of $k$ times differentiable functions
$\{u_0, \dots, u_k\}$ is a Chebyshev system on the interval $[A, B]$. To be precise,
define $w_0(x) = u_0(x)$ and recursively differential operators

(3.11) $D_j f = \frac{d}{dx}\left(\frac{f}{w_j}\right),$

(3.12) $w_{j+1} = (D_j D_{j-1} \cdots D_0) u_{j+1}; \qquad j = 0, 1, \dots, k-1.$

Consequently, the set $\{u_0, \dots, u_k\}$ is a Chebyshev system if the functions
$w_0, \dots, w_k$ calculated by (3.11) and (3.12) are all positive on the interval $[A, B]$.
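
As an illustration of the recursion, the following sketch evaluates (3.11) and (3.12) for the monomials $u_j(x) = x^j$, reading (3.11) as $D_j f = (f / w_j)'$; this reading is an assumption on our part, since the displayed formula is only partially legible in the source. The sketch is restricted to the case where every $w_j$ turns out to be a constant, so that polynomials stored as coefficient lists suffice. The function names are hypothetical.

```python
from fractions import Fraction


def derivative(poly):
    """Derivative of a polynomial given by its coefficients [c0, c1, ...]."""
    return [i * c for i, c in enumerate(poly)][1:] or [Fraction(0)]


def monomial(j):
    """Coefficient list of x^j."""
    return [Fraction(0)] * j + [Fraction(1)]


def weights_for_monomials(k):
    """w_0, ..., w_k from (3.11)-(3.12) for u_j(x) = x^j (all w_j constant)."""
    w = [Fraction(1)]                      # w_0 = u_0 = 1
    for j in range(k):
        f = monomial(j + 1)
        for i in range(j + 1):             # apply D_0 first, then D_1, ..., D_j
            f = derivative([c / w[i] for c in f])
        assert len(f) == 1                 # the result is a constant here
        w.append(f[0])
    return w


weights = weights_for_monomials(4)
```

For this system the recursion yields $w_0 = 1$ and $w_j = j$ for $j \ge 1$, all positive, which is consistent with the fact that the monomials form a Chebyshev system on any interval.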

Remark 3.3. Yang (2010) constructed a triangular array of functions
$\{f_{l,t} \mid t = 1, \dots, k;\ t \le l \le k\}$ from the functions $\Psi_1, \dots, \Psi_k$ induced by the
nonlinear regression model (2.1) using the recursion

$$f_{l,t}(x) = \begin{cases} \Psi_l'(x), & t = 1,\ l = 1, \dots, k, \\ \left(\dfrac{f_{l,t-1}(x)}{f_{t-1,t-1}(x)}\right)', & 2 \le t \le k;\ t \le l \le k. \end{cases}$$

It is now easy to see that the functions $w_1, \dots, w_k$ obtained from (3.11)
and (3.12) with $w_0 = 1$, $u_j = \Psi_j$ ($j = 1, \dots, k$) are precisely the functions $f_{l,l}$
defined by Yang (2010). As a consequence, we will obtain the main result of
Yang (2010) as a special case of our Theorem 3.1 (note that our assumptions
regarding the differentiability are slightly weaker than in this reference).

Theorem 3.4. Let $\Psi_1, \dots, \Psi_k$ denote the $k$ different functions in the
matrix $C$ in (2.6) corresponding to the nonlinear regression model
which are not equal to the constant function. Assume that $\Psi_j$ is $(j + 1)$
times continuously differentiable, define $w_0 = 1$ and for $j = 0, \dots, k-1$

$$w_{j+1} = D_j D_{j-1} \cdots D_0 \Psi_{j+1}$$

and assume that condition (3.2) is satisfied. If

$$F(x) = w_1(x) \cdots w_k(x) \ne 0$$

for all $x \in [A, B]$, then for any given design $\xi$ there exists a design $\tilde\xi$ such
that $I(\tilde\xi) \le k/2$ and

$$M(\tilde\xi, \theta) \ge M(\xi, \theta).$$

If the index of the design $\xi$ satisfies $I(\xi) < k/2$ then $\tilde\xi$ is uniquely determined
in the class of all designs $\eta$ with moments specified by (3.5) and coincides
with the design $\xi$. Otherwise [in the case $I(\xi) \ge k/2$] the following assertions
are valid.

(1a) If $k$ is odd and $F(x) < 0$ on the interval $[A, B]$, then the design $\tilde\xi$ has
at most $(k + 1)/2$ support points and $\tilde\xi$ can be chosen such that the
point $A$ is a support point.
(1b) If $k$ is odd and $F(x) > 0$ on the interval $[A, B]$, then the design $\tilde\xi$ has
at most $(k + 1)/2$ support points and $\tilde\xi$ can be chosen such that the
point $B$ is a support point.
(2a) If $k$ is even and $F(x) < 0$ on the interval $[A, B]$, then the design $\tilde\xi$ has
at most $k/2$ support points.
(2b) If $k$ is even and $F(x) > 0$ on the interval $[A, B]$, then the design $\tilde\xi$
has at most $k/2 + 1$ support points and $\tilde\xi$ can be chosen such that the
points $A$ and $B$ are support points.



Proof. Let us define $\Psi_0(x) = 1$ and note that

$$F(x) = \frac{W(\Psi_0, \dots, \Psi_k)(x)}{W(\Psi_0, \dots, \Psi_{k-1})(x)}.$$

Thus if $F(x) > 0$ then condition (3.3) is fulfilled, and if $F(x) < 0$ then
condition (3.4) is fulfilled. Now Theorem 3.4 is an immediate corollary of
Theorem 3.1. □
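
The case distinction of Theorem 3.4 for designs with $I(\xi) \ge k/2$ can be summarized in a small lookup. This is only a restatement of cases (1a) to (2b) as code; the function name `support_bound` and the string labels for the endpoints are hypothetical.

```python
def support_bound(k, f_positive):
    """Return (max number of support points, endpoints that can be forced).

    k          -- number of non-constant functions Psi_1, ..., Psi_k
    f_positive -- True if F(x) > 0 on [A, B], False if F(x) < 0 on [A, B]
    """
    if k % 2 == 1:
        # Cases (1a) and (1b): (k + 1) / 2 points, one endpoint in the support.
        return ((k + 1) // 2, ("B",) if f_positive else ("A",))
    if f_positive:
        # Case (2b): k / 2 + 1 points including both endpoints.
        return (k // 2 + 1, ("A", "B"))
    # Case (2a): k / 2 points, no statement about the endpoints.
    return (k // 2, ())
```

For example, `support_bound(3, True)` describes a model with three non-constant functions and positive $F$: two support points suffice, one of which can be taken to be $B$.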



Remark 3.5. Note that if the constant function appears among the
different functions $\{\Psi_{ij} \mid 1 \le i \le j \le p\}$ in the matrix $C$ in (2.6) it is
not counted in Theorem 3.4 or Theorem 2 of Yang (2010) (see the proof of
Theorems 3 and 5-7 in this reference).

A number of interesting applications of Theorem 3.4 are given in Yang
(2010). Note that in all examples considered there the functions under
consideration generate a special type of Chebyshev system, namely extended
Chebyshev systems that can be generated by formulas (3.9). This follows
from Remark 3.3 and the discussion before Theorem 3.4. Note that several
other interesting examples for the case of two parameters are given in Yang
and Stufken (2009). All these examples are based on Lemma 1 from that
paper, and the conditions of this lemma in fact imply that the system of
the three functions (corresponding to different elements of the information
matrix) is an extended Chebyshev system. Thus, these examples can also
be considered as particular cases of Theorem 3.1.

The main advantage of Theorem 3.1 consists in the fact that the de la 
Garza phenomenon can be established by proving that the system under 
consideration is a Chebyshev system. For this purpose, several methods are 
available which differ from the approach presented in Yang (2010) and in 
the next section we will consider an example illustrating the usefulness of 
Theorem 3.1. 



DE LA GARZA PHENOMENON FOR LOCALLY OPTIMAL DESIGNS 






4. An application to rational regression models. In this section, we present
a class of nonlinear regression models where Theorem 3.4 [or Theorem 2 in
Yang (2010)] is not directly applicable, but the de la Garza phenomenon
can be established by an application of Theorem 3.1. For this purpose, we
consider rational regression models of the form

(4.1) $\eta(x, \theta) = \dfrac{P(x, \theta_{(1)})}{Q(x, \theta_{(2)})},$

where

$$P(x, \theta_{(1)}) = \theta_1 + \theta_2 x + \dots + \theta_l x^{l-1},$$

$$Q(x, \theta_{(2)}) = 1 + \theta_{l+1} x + \dots + \theta_{l+s} x^s$$

are polynomials of degree $l - 1$ and $s$, respectively, with corresponding
parameters

$$\theta_{(1)} = (\theta_1, \dots, \theta_l)^T, \qquad \theta_{(2)} = (\theta_{l+1}, \dots, \theta_{l+s})^T.$$

It is shown in He, Studden and Sun (1996) that the information matrix for
this model can be written in the form

$$M(\xi, \theta) = B(\theta)\, C(\xi, \theta)\, B^T(\theta),$$

where $\theta = (\theta_1, \dots, \theta_{l+s})^T$, $B$ denotes an appropriate matrix [see He, Studden
and Sun (1996)], the matrix $C$ is given by

$$C(\xi, \theta) = \int_A^B [1/Q^4(x)]\, h(x) h(x)^T\, d\xi(x),$$

$h(x) = (1, x, \dots, x^{p-1})^T$ denotes the vector of monomials with $p = l + s$ and
$Q(x)$ is a polynomial of degree $s$. Therefore, it follows that the different
functions in the information matrix are given by

$$\Psi_1(x) = 1/Q^4(x), \quad \dots, \quad \Psi_k(x) = x^{k-1}/Q^4(x),$$

where $k = 2p - 1$. Define $\Psi_0(x) = 1$, then it is well known [see Karlin and
Studden (1966a)] that under the conditions:

(a) $Q(x)$ does not vanish in the interval $[A, B]$;

(b) $[Q^4(x)]^{(2p-1)}$ does not vanish in the interval $[A, B]$

the functions $\Psi_0, \Psi_1, \dots, \Psi_{2p-1}$ generate a Chebyshev system on the interval
$[A, B]$ and Theorem 3.1 is applicable here.

However, we will give an alternative proof of this property which yields,
as a by-product, a constructive condition under which the condition (b) is
fulfilled. Assume that $Q^4(x) > 0$ for all $x \in [A, B]$ and note that a Chebyshev
system remains a Chebyshev system after multiplication of all functions by
a positive function. Thus, in order to apply Theorem 3.1 it is sufficient to
prove that the functions

$$1, x, x^2, \dots, x^{2p-2}, Q^4(x)$$

generate a Chebyshev system on the interval $[A, B]$. The following lemma
provides a sufficient condition for this property.

Lemma 4.1. Assume that the polynomial $Q(x)$ has only real roots which
are either all smaller than $A$ or larger than $B$. If $s > l - 1$, then the functions

$$1, x, x^2, \dots, x^{2p-2}, \varepsilon Q^4(x)$$

generate a Chebyshev system on the interval $[A, B]$, where $\varepsilon = +1$ if the roots
are smaller than $A$ and $\varepsilon = -1$ if the roots are larger than $B$.

Proof. Based on the assumptions about $Q(x)$, the polynomial $Q^4(x)$
can be written as $c \prod_{i=1}^{4s} (x - a_i)$, where the $a_i$ are not necessarily distinct. Clearly,
$(Q^4(x))^{(k)} = c\,k! \sum_{A_k} \prod_{j \in A_k} (x - a_j)$, where the sum runs over all
subsets $A_k$ of $\{1, \dots, 4s\}$ with $4s - k$ elements. Define $x_{\min}$ and $x_{\max}$ as the smallest
and largest root of $Q(x)$, then all derivatives of $Q^4(x)$ of even order less than
$4s - 1$ are positive outside of the interval $[x_{\min}, x_{\max}]$. Define $u_0(x) = 1$,
$u_1(x) = x, \dots, u_{2p-2}(x) = x^{2p-2}$, $u_{2p-1}(x) = Q^4(x)$. By formulas (3.11) and
(3.12), we can easily calculate that $w_0(x) = 1$, $w_j(x) = j$, $j = 1, \dots, 2p-2$, and that
$w_{2p-1}(x)$ equals $[Q^4(x)]^{(2p-1)}$ up to the positive constant factor $1/(2p-2)!$.
Thus, if $s > l - 1$ it follows that $w_{2p-1}(x)$ is negative
for $x < x_{\min}$ and positive for $x > x_{\max}$. Therefore (note that
$[Q^4(x)]^{(2p-1)}$ has no roots in the interval $[A, B]$), we have $\varepsilon w_{2p-1}(x) > 0$ for
all $x \in [A, B]$. Now the assertion of Lemma 4.1 follows from the formula
for the Wronskian determinant in (3.10) and the fact that a positive Wronskian
determinant is sufficient for the Chebyshev property of the functions
$u_0, \dots, u_{2p-1}$. □
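
The sign statement used in this proof can be checked numerically for concrete denominators. The sketch below is an illustration under stated assumptions: the polynomials $Q(x) = (1 + x)(1 + x/2)$ (roots smaller than $A = 0$) and $Q(x) = (1 - x/2)(1 - x/3)$ (roots larger than $B = 1$), the choice $l = 1$, $s = 2$ and all helper names are hypothetical. Signs are evaluated only on a finite grid in $[0, 1]$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_derivative(a, order):
    """Derivative of the given order of a coefficient-list polynomial."""
    for _ in range(order):
        a = [i * c for i, c in enumerate(a)][1:] or [Fraction(0)]
    return a


def poly_eval(a, x):
    """Value of the polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(a))


def q4_derivative(q, p):
    """(2p - 1)th derivative of Q^4, a positive multiple of w_{2p-1}."""
    q2 = poly_mul(q, q)
    return poly_derivative(poly_mul(q2, q2), 2 * p - 1)


# Hypothetical model orders with s > l - 1, so p = l + s = 3.
l, s = 1, 2
p = l + s
grid = [Fraction(i, 10) for i in range(11)]   # grid in [A, B] = [0, 1]

q_left = poly_mul([Fraction(1), Fraction(1)], [Fraction(1), Fraction(1, 2)])
q_right = poly_mul([Fraction(1), Fraction(-1, 2)], [Fraction(1), Fraction(-1, 3)])

signs_left = [poly_eval(q4_derivative(q_left, p), x) > 0 for x in grid]
signs_right = [poly_eval(q4_derivative(q_right, p), x) < 0 for x in grid]
```

In both cases the derivative keeps a constant sign on the grid, positive when the roots lie to the left of the interval and negative when they lie to the right, which matches the role of $\varepsilon$ in Lemma 4.1.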

The following result is now an immediate consequence of Lemma 4.1 and 
Theorem 3.1 (note that we do not repeat the statement of uniqueness of the 
latter result). 

Theorem 4.2. Consider the rational regression model (4.1). Assume
that $s > l - 1$ and that the polynomial $Q(x)$ has only real roots, which are
either all smaller than $A$ or larger than $B$. Then for any design $\xi$ there
exists a design $\tilde\xi$ with at most $p$ support points, such that $M(\xi, \theta) \le M(\tilde\xi, \theta)$.
Moreover:

(1) if the index of $\xi$ satisfies $I(\xi) \ge p - \frac{1}{2}$ and all roots of the polynomial
$Q$ are smaller than $A$, then $\tilde\xi$ can be chosen such that the support of $\tilde\xi$
contains the point $A$,
(2) if the index of $\xi$ satisfies $I(\xi) \ge p - \frac{1}{2}$ and all roots of the polynomial
$Q$ are larger than $B$, then $\tilde\xi$ can be chosen such that the support of $\tilde\xi$
contains the point $B$.

Remark 4.3. (a) Theorem 4.2 is an extension of Theorem 5 in He,
Studden and Sun (1996), who investigated only locally $D$-optimal designs.

(b) Note that Yang (2010) considered the classical weighted polynomial
regression model where the different functions in the information matrix are
given by $\Psi_j(x) = \lambda(x) x^{j-1}$, $j = 1, \dots, 2p-1$, where $\lambda$ is a positive function
on the interior of the design space, which is called the efficiency function [see
Dette and Trampisch (2010)]. His findings can be generalized in the following
way. If there exists a function $g(x)$ such that

(4.2) (±{\{ x ) x ^)/ g { x )\ = Cj , g(x)>0,xe[A,B], 

for some constants $c_j \in \mathbb{R} \setminus \{0\}$, $j = 1, \dots, 2p-1$, then one can denote

*i(aO = f g(t)dt, = j = l,...,2p-l, 

Jo 

and obtains a system of functions satisfying the assumptions of Theorem 3.4.
In particular, in Theorem 9 of Yang (2010) for the case $\lambda(x) = \exp(x^2)$ the
function $g(x) = \lambda(x) = \exp(x^2)$ is appropriate, while the case
$\lambda(x) = (1 - x)^{\alpha+1}(1 + x)^{\beta+1}$, $\alpha > -1$, $\beta > -1$ requires the choice
$g(x) = (1 - x)^{\alpha}(1 + x)^{\beta}$.
Moreover, the differential equation (4.2) shows that there are many other
efficiency functions for which the de la Garza phenomenon in the weighted
polynomial regression model occurs. For example, if $\lambda(x) = 1/(1 + x)^n$,
$A > -1$, $n > 2p - 2$ one could use

g(x) = l/(l + x)^ 

and it follows that for the weighted polynomial regression model with this
efficiency function any optimal design can be based on at most $p$ points.
However, for the rational model of the form (4.1) such a technique seemingly
does not work. The alternative way is to prove that the functions
$1, x, \dots, x^k, \lambda(x)^{-1}$ generate a Chebyshev system and to use the new
Theorem 3.1 to establish the de la Garza phenomenon. Such a method has been
realized for the rational model (4.1) in the proof of Theorem 4.2.